few-shot segmentation
Feature-Proxy Transformer for Few-Shot Segmentation
Few-shot segmentation (FSS) aims at performing semantic segmentation on novel classes given a few annotated support samples. Rethinking recent advances, we find that the current FSS framework has deviated far from the supervised segmentation framework: given the deep features, FSS methods typically use an intricate decoder to perform sophisticated pixel-wise matching, while supervised segmentation methods use a simple linear classification head. Due to the intricacy of the decoder and its matching pipeline, such an FSS framework is not easy to follow. This paper revives the straightforward framework of "feature extractor + linear classification head" and proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the "proxy" is the vector representing a semantic class in the linear classification head. FPTrans has two key points for learning discriminative features and representative proxies: 1) To better utilize the limited support samples, the feature extractor makes the query interact with the support features from bottom to top layers using a novel prompting strategy.
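To make the "feature extractor + linear classification head" idea concrete, the following is a minimal NumPy sketch, not the FPTrans implementation. It assumes that a proxy is obtained by masked average pooling of support features, which the abstract does not state, and it uses hand-made toy features in place of a real feature extractor. The names masked_average_proxy and linear_head are hypothetical.

```python
import numpy as np

def masked_average_proxy(features, mask):
    """Average the (H, W, C) features over pixels where the (H, W) mask is 1.

    Assumption for illustration: the abstract does not say how proxies are built.
    """
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        raise ValueError("mask selects no pixels")
    return (features * weights[..., None]).sum(axis=(0, 1)) / total

def linear_head(features, proxies):
    """Linear classification head: per-pixel dot product with each class proxy."""
    logits = features @ proxies.T      # (H, W, K)
    return logits.argmax(axis=-1)      # (H, W) class indices

# Toy support sample: left half is foreground, right half is background.
support_feat = np.zeros((4, 4, 2))
support_mask = np.zeros((4, 4), dtype=int)
support_mask[:, :2] = 1
support_feat[:, :2] = [1.0, 0.0]
support_feat[:, 2:] = [0.0, 1.0]

# Row 0 is the background proxy, row 1 is the foreground proxy.
bg_proxy = masked_average_proxy(support_feat, 1 - support_mask)
fg_proxy = masked_average_proxy(support_feat, support_mask)
proxies = np.stack([bg_proxy, fg_proxy])

# Toy query features: top half resembles foreground, bottom half background.
query_feat = np.zeros((4, 4, 2))
query_feat[:2] = [0.9, 0.1]
query_feat[2:] = [0.2, 0.8]
pred = linear_head(query_feat, proxies)
```

In this sketch the head has no decoder at all: segmentation quality depends entirely on how discriminative the features are and how representative the proxies are, which is the trade-off the abstract describes.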
Mask Matching Transformer for Few-Shot Segmentation
In this paper, we aim to tackle the challenging few-shot segmentation task from a new perspective. Typical methods follow a paradigm that first learns prototypical features from support images and then matches query features at the pixel level to obtain segmentation results. However, to obtain satisfactory segments, such a paradigm needs to couple the learning of the matching operations with heavy segmentation modules, limiting the flexibility of design and increasing the learning complexity. To alleviate this issue, we propose Mask Matching Transformer (MM-Former), a new paradigm for the few-shot segmentation task. Specifically, MM-Former first uses a class-agnostic segmenter to decompose the query image into multiple segment proposals.
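The proposal-based paradigm can be illustrated with a small NumPy sketch. This is not MM-Former itself: the proposals are given as hand-made masks rather than produced by a class-agnostic segmenter, and the matching rule (cosine similarity between a pooled proposal feature and a support prototype, followed by a fixed threshold) is an assumption chosen for illustration. The function names and the threshold value are hypothetical.

```python
import numpy as np

def pool(features, mask):
    """Average (H, W, C) features over the pixels selected by an (H, W) mask."""
    w = mask.astype(float)
    return (features * w[..., None]).sum(axis=(0, 1)) / max(w.sum(), 1.0)

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def merge_matching_proposals(query_feat, proposals, support_proto, threshold=0.5):
    """Union of the proposals whose pooled feature matches the support prototype.

    Assumed matching rule for illustration only.
    """
    merged = np.zeros(proposals.shape[1:], dtype=bool)
    scores = []
    for m in proposals:
        s = cosine(pool(query_feat, m), support_proto)
        scores.append(s)
        if s > threshold:
            merged |= m.astype(bool)
    return merged, scores

# Three toy proposals over a 2x4 query: column 0, column 1, columns 2-3.
proposals = np.zeros((3, 2, 4), dtype=bool)
proposals[0, :, 0] = True
proposals[1, :, 1] = True
proposals[2, :, 2:] = True

# Toy query features: columns 0-1 resemble the target class, columns 2-3 do not.
query_feat = np.zeros((2, 4, 2))
query_feat[:, 0] = [1.0, 0.0]
query_feat[:, 1] = [0.9, 0.1]
query_feat[:, 2:] = [0.0, 1.0]

support_proto = np.array([1.0, 0.0])   # stand-in for a prototype from support images
merged, scores = merge_matching_proposals(query_feat, proposals, support_proto)
```

Because matching operates on whole proposals rather than individual pixels, the segmenter and the matching step can in principle be designed and trained separately.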
Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning - Supplementary Material
Different fine-tuning strategies: In Figure 1, we visualize the mIoU curves of different fine-tuning strategies. It can be seen that both layer-based and convolution-based fine-tuning methods bring over-fitting problems. This result shows that traditional fine-tuning methods are not suitable for few-shot segmentation tasks. Directly fine-tuning the parameters of the backbone in few-shot learning affects the robustness of FSS models. Therefore, we propose a novel fine-tuning strategy, namely SVF.
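As a rough illustration of fine-tuning only singular values, the NumPy sketch below decomposes a single weight matrix with SVD, keeps the two orthogonal factors frozen, and updates only the singular values by gradient descent. It is a toy under stated assumptions, not the authors' implementation: the random matrix W stands in for a pretrained layer, the regression loss and the target are invented for the example, and the learning rate and step count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one pretrained weight matrix of shape (out_features, in_features).
W = rng.normal(size=(4, 6))
U, s0, Vt = np.linalg.svd(W, full_matrices=False)   # W == U @ diag(s0) @ Vt

# Toy regression task whose optimum differs from W only in its singular values.
X = rng.normal(size=(64, 6))
W_target = U @ np.diag(1.5 * s0) @ Vt
Y = X @ W_target.T

def loss_and_grad(s):
    """Mean squared-error loss and its gradient with respect to s only."""
    W_cur = U @ np.diag(s) @ Vt
    err = X @ W_cur.T - Y                            # (n, out)
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))
    G = err.T @ X / len(X)                           # dL/dW, shape (out, in)
    grad_s = np.diag(U.T @ G @ Vt.T)                 # dL/ds_i = u_i^T G v_i
    return loss, grad_s

s = s0.copy()
initial_loss, _ = loss_and_grad(s)
for _ in range(200):
    _, grad_s = loss_and_grad(s)
    s -= 0.1 * grad_s                                # U and Vt are never updated
final_loss, _ = loss_and_grad(s)

trainable = s.size                                   # 4 singular values
full = W.size                                        # 24 entries in W
```

Here only 4 values are trained instead of the 24 entries of W, which is the sense in which such a strategy is a few-parameter alternative to fine-tuning whole layers.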